[RISCV] add load/store misched/PostRA subtarget features #149409

danielhb · 2025-07-17T21:22:58Z

Some processors benefit more from store clustering than load clustering,
and vice-versa, depending on factors that are exclusive to each one
(e.g. macrofusions implemented).

Likewise, certain optimizations benefits more from misched clustering
than postRA clustering. Macrofusions are again an example: in a
processor with store pair macrofusions, like the veyron-v1, it is
observed that misched clustering increases the amount of macrofusions
more than postRA clustering. This of course isn't necessarily true for
other processors, but it shows that processors can benefit from a more
fine grained control of clustering mutations, and each one is able to do
it differently.

Add 4 new subtarget features that deprecates the existing
riscv-misched-load-store-clustering and riscv-postmisched-load-store-clustering
options:

disable-misched-load-clustering and disable-misched-store-clustering:
disable load/store clustering during misched;
disable-postmisched-load-clustering and disable-postmisched-store-clustering:
disable load/store clustering during PostRA.

Note that the new subtarget features disables specific stages of the default
clustering settings. The default per se (load and store clustering for both
misched and PostRA) is left untouched.

Disable all clustering but misched-store-clustering for the veyron-v1 processor
using the new features.

github-actions · 2025-07-17T21:23:13Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-07-17T21:23:43Z

@llvm/pr-subscribers-backend-risc-v

Author: Daniel Henrique Barboza (danielhb)

Changes

Hi,

This PR adds 4 new cluster options to fine tune how loads and stores are grouped during misched and postRA passes.

The work was motivated by macrofusion tests with the veyron-v1 processor model. This model has stored pair macrofusions but it doesn't do much with load pairs. Adding new options to turn off load clustering improved the macrofusions amount and performance.

The existing 2 cluster options added by @asb are kept because I'm afraid to break existing users that might be using them. If that' s not a concern we can remove them and keep the new options only.

Full diff: https://github.com/llvm/llvm-project/pull/149409.diff

1 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+34-8)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index b43b915d0ad4f..24ddd3e1fd1c7 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -99,11 +99,31 @@ static cl::opt<bool> EnableMISchedLoadStoreClustering(
     cl::desc("Enable load and store clustering in the machine scheduler"),
     cl::init(true));
 
+static cl::opt<bool> EnableMISchedLoadClustering(
+    "riscv-misched-load-clustering", cl::Hidden,
+    cl::desc("Enable load clustering in the machine scheduler"),
+    cl::init(true));
+
+static cl::opt<bool> EnableMISchedStoreClustering(
+    "riscv-misched-store-clustering", cl::Hidden,
+    cl::desc("Enable store clustering in the machine scheduler"),
+    cl::init(true));
+
 static cl::opt<bool> EnablePostMISchedLoadStoreClustering(
     "riscv-postmisched-load-store-clustering", cl::Hidden,
     cl::desc("Enable PostRA load and store clustering in the machine scheduler"),
     cl::init(true));
 
+static cl::opt<bool> EnablePostMISchedLoadClustering(
+    "riscv-postmisched-load-clustering", cl::Hidden,
+    cl::desc("Enable PostRA load clustering in the machine scheduler"),
+    cl::init(true));
+
+static cl::opt<bool> EnablePostMISchedStoreClustering(
+    "riscv-postmisched-store-clustering", cl::Hidden,
+    cl::desc("Enable PostRA store clustering in the machine scheduler"),
+    cl::init(true));
+
 static cl::opt<bool>
     EnableVLOptimizer("riscv-enable-vl-optimizer",
                       cl::desc("Enable the RISC-V VL Optimizer pass"),
@@ -301,10 +321,13 @@ ScheduleDAGInstrs *
 RISCVTargetMachine::createMachineScheduler(MachineSchedContext *C) const {
   ScheduleDAGMILive *DAG = createSchedLive(C);
   if (EnableMISchedLoadStoreClustering) {
-    DAG->addMutation(createLoadClusterDAGMutation(
-        DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
-    DAG->addMutation(createStoreClusterDAGMutation(
-        DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
+    if (EnableMISchedLoadClustering)
+      DAG->addMutation(createLoadClusterDAGMutation(
+          DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
+
+    if (EnableMISchedStoreClustering)
+      DAG->addMutation(createStoreClusterDAGMutation(
+          DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
   }
 
   const RISCVSubtarget &ST = C->MF->getSubtarget<RISCVSubtarget>();
@@ -318,10 +341,13 @@ ScheduleDAGInstrs *
 RISCVTargetMachine::createPostMachineScheduler(MachineSchedContext *C) const {
   ScheduleDAGMI *DAG = createSchedPostRA(C);
   if (EnablePostMISchedLoadStoreClustering) {
-    DAG->addMutation(createLoadClusterDAGMutation(
-        DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
-    DAG->addMutation(createStoreClusterDAGMutation(
-        DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
+    if (EnablePostMISchedLoadClustering)
+      DAG->addMutation(createLoadClusterDAGMutation(
+          DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
+
+    if (EnablePostMISchedStoreClustering)
+      DAG->addMutation(createStoreClusterDAGMutation(
+          DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
   }
 
   return DAG;

mgudim · 2025-07-18T14:52:42Z

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp

even though the old option is preserved, it now does nothing?

All added options are defaulted to 'true', similar to the 2 existing options. But the older options have precedence. In the commit msg I gave an example:

"To preserve the existing behavior of the existing flags all options added
are default 'true'. Note that the two existing flags have precedence, so
setting "riscv-misched-load-store-clustering=false" will disable load
mutations even if setting riscv-misched-load-clustering=true. Same thing
with stores and with the postRA options."

So riscv-misched-load-store-clustering=false will turn off misched clustering for both loads and stores, regardless of riscv-misched-(load/store)-clustering being set to 'true' in the same command line.

Perhaps a better wording would be "disabling takes precedence" because that's how the code works, e.g.:

riscv-misched-load-store-clustering=false will always disable both load and store clustering for misched;

riscv-misched-load-clustering=false will always disable load clustering for misched;

riscv-misched-store-clustering=false will always disable store clustering for misched.

I thought about inverting the semantics of the flag, i.e. adding options that would disable steps instead of enabling, but that didn't make the situation much better. What would make it better is to remove the existing 2 options, but then we'll have to deal with potential broken scripts.

OK, got it. Thanks!

mgudim

LGTM, but let's wait to see what others say if it is OK to remove the old options.

mshockwave

I'm generally in favor of removing the existing two options (i.e. riscv-misched-load-store-clustering and riscv-postmisched-load-store-clustering). This is not a Clang flag and LLVM flags change all the time.

Non blocking, but an alternative solution to avoid having two flags could be making -riscv-misched-load-store-clustering (and its post-RA counterpart) to take a list of "load" or "store" instead to enable either or both clusterings.

danielhb · 2025-07-18T17:24:02Z

Thanks for the reviews. Removing the original flags is a cleaner way of going about it so I just did that.

mshockwave

LGTM w/ minor comments

Please avoid force push unless necessary: https://llvm.org/docs/GitHub.html#rebasing-pull-requests-and-force-pushes

mshockwave · 2025-07-18T17:27:02Z

llvm/test/CodeGen/RISCV/misched-load-clustering.ll

could you add new tests checking the situation where only either of the flags is enabled?

Sure, I'll add two new tests disabling load and store clustering respectively

wangpc-pp

LGTM! Thanks!

mgudim

LGTM, thanks!

topperc · 2025-07-21T19:43:11Z

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp

Should these come from subtarget instead of command line?

Did I miss something? We don't have subtarget features for these?

That was my point. Should we add Subtarget features instead? The description of the PR suggests that whether this is profitable is CPU specific.

Yeah, make sense to me.

By Subtarget feature are we talking about a a feature, like FeatureUnalignedScalarMem, or a tune like TunePostRAScheduler (took the examples from sifive_p670)? Seems to me that this would be more like a tuning like PostRAScheduler.

Also, do we want to keep the default as is (enable all load/store clustering in both stages) and add features to disable it, or the other way around? Having everything disabled by default and adding subtarget features to enable what you want is cleaner, but we'll change behavior for everyone that doesn't add the new features in their target defs. Perhaps this is a good thing (we'll force people to make a decision instead of relying on defaults) ....

I just checked how TunePostRAScheduler is implemented and it has both a subtarget feature and also a flag (post-RA-scheduler) that can be used to override the target definition. I like this design because I give the flexibility of the command line (for debugging and whatnot) while also giving each processor a convenient way of setting clustering preferences.

If no one opposes I'll go this direction, but I guess we'll want to turn the default to 'false' and let each processor define what clustering they want

Changing the default settings of load/store clustering breaks 290+ tests in check-llvm. This shows that there's a LOT of code and logic built on top of the existing default. I don't want to touch this viper nest in this work so I'll keep the default as is. This means that the subtarget features will disable the default settings instead of enabling - TuneNoDefaultUnroll would be an example.

Also, after adding both the command line options and subtarget features I realized that I might be complicating things too much, given that we can override the subtarget features via "mattr=-" anyway, so I'll change the code to send just the subtarget features. I'll send a new version shortly.

mgudim · 2025-07-24T15:30:29Z

llvm/lib/Target/RISCV/RISCVFeatures.td

I think @topperc meant to mimic something like "bool RISCVSubtarget::useAA()" for example

No I meant something we could set per-CPU like this.

wangpc-pp · 2025-07-25T02:32:49Z

llvm/lib/Target/RISCV/RISCVFeatures.td

Suggested change

"NoMISchedLoadClustering", "true", "Disable load clustering in the machine scheduler">;

"MISchedLoadClustering", "false", "Disable load clustering in the machine scheduler">;

Negate the enum and the value. See also TuneNoDefaultUnroll and TuneNoSinkSplatOperands.

You can add Enable before the enum, like EnableMISchedLoadClustering. So that the auto-generated methods will be like enableXXX().

That explains why I didn't find an enableDefaultUnroll() method ...

wangpc-pp · 2025-07-25T10:59:00Z

llvm/lib/Target/RISCV/RISCVSubtarget.h

I think we don't need these because there are auto-generated methods.

wangpc-pp

LGTM.

topperc

LGTM

danielhb · 2025-08-06T11:23:13Z

Ping

Do we require more reviews or is this good to go? Also, I don't have write access (this is my first PR for LLVM) so I'm afraid someone will have to do the honors merging this one. I'm not sure if a rebase is needed due to the conflict being pointed out too ...

danielhb · 2025-08-06T13:25:16Z

Updating the branch to fix conflicts

topperc · 2025-08-06T15:11:58Z

Ping

Do we require more reviews or is this good to go? Also, I don't have write access (this is my first PR for LLVM) so I'm afraid someone will have to do the honors merging this one. I'm not sure if a rebase is needed due to the conflict being pointed out too ...

You have enough reviews. The reviewers probably didn't think about you not having write permission.

Some processors benefit more from store clustering than load clustering, and vice-versa, depending on factors that are exclusive to each one (e.g. macrofusions implemented). Likewise, certain optimizations benefits more from misched clustering than postRA clustering. Macrofusions are again an example: in a processor with store pair macrofusions, like the veyron-v1, it is observed that misched clustering increases the amount of macrofusions more than postRA clustering. This of course isn't necessarily true for other processors, but it shows that processors can benefit from a more fine grained control of clustering mutations, and each one is able to do it differently. Add 4 new clustering options that deprecates the existing riscv-misched-load-store-clustering and riscv-postmisched-load-store-clustering options: - riscv-misched-load-clustering and riscv-misched-store-clustering: enable/disable load/store clustering during misched; - riscv-postmisched-load-clustering and riscv-postmisched-store-clustering: enable/disable load/store clustering during PostRA. To preserve the existing clustering behavior all 4 options are defaulted to 'true'.

- add store-clustering=false to the existing LDCLUSTER test; - add a new STCLUSTER test with load-clustering=false; - add a new DEFAULTCLUSTER test with default clustering options.

Added 4 subtarget features that disables the default cluster settings: - TuneDisableMISched(Load|Store)Clustering - TuneDisablePostMISched(Load|Store)Clustering The 4 command line options added previously were removed. Use the new subtarget features in Ventana veyron-v1 to allow only misched store clustering. Test changes: - add a store clustering test: misched-store-clustering.ll - add subtarget features tests in misched-load-clustering.ll and misched-store-clustering.ll - misched-mem-clustering.mir: use the subtarget features

Instead of using NoMISchedLoadClustering and setting it to 'true' using the feature, rename it to NoMISchedLoadClustering and make the feature set it to 'false'. Rename the subtarget method disableMISchedLoadClustering() to enableMISchedLoadClustering() to remove the negative check in createMachineScheduler() when enabling the clustering. Do the same with the other 3 subtarget features.

We can get the existing enableXXClustering methods to be autogenerated by prefixing the enums with 'Enable', e.g. 'EnableMISchedLoadClustering', and an autogenerated 'enableMISchedLoadClustering' method will be created.

danielhb · 2025-08-06T15:48:08Z

Ping
Do we require more reviews or is this good to go? Also, I don't have write access (this is my first PR for LLVM) so I'm afraid someone will have to do the honors merging this one. I'm not sure if a rebase is needed due to the conflict being pointed out too ...

You have enough reviews. The reviewers probably didn't think about you not having write permission.

Got it. I asked @mgudim for help.

github-actions · 2025-08-06T16:08:43Z

@danielhb Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

mshockwave · 2025-08-06T16:09:09Z

Ping
Do we require more reviews or is this good to go? Also, I don't have write access (this is my first PR for LLVM) so I'm afraid someone will have to do the honors merging this one. I'm not sure if a rebase is needed due to the conflict being pointed out too ...

You have enough reviews. The reviewers probably didn't think about you not having write permission.

Got it. I asked @mgudim for help.

I just merged it on your behalf

llvmbot added the backend:RISC-V label Jul 17, 2025

mgudim reviewed Jul 18, 2025

View reviewed changes

mgudim requested a review from topperc July 18, 2025 14:53

mgudim approved these changes Jul 18, 2025

View reviewed changes

mshockwave reviewed Jul 18, 2025

View reviewed changes

danielhb force-pushed the load_store_cluster_opts branch from 1576ca9 to 9de5329 Compare July 18, 2025 17:21

danielhb requested a review from mgudim July 18, 2025 17:24

mshockwave approved these changes Jul 18, 2025

View reviewed changes

wangpc-pp approved these changes Jul 19, 2025

View reviewed changes

mgudim approved these changes Jul 21, 2025

View reviewed changes

topperc reviewed Jul 21, 2025

View reviewed changes

danielhb changed the title ~~[RISCV] add load/store misched/PostRA cluster options~~ [RISCV] add load/store misched/PostRA subtarget features Jul 24, 2025

danielhb requested review from mshockwave and wangpc-pp July 24, 2025 13:57

mgudim reviewed Jul 24, 2025

View reviewed changes

wangpc-pp reviewed Jul 25, 2025

View reviewed changes

wangpc-pp approved these changes Jul 28, 2025

View reviewed changes

topperc approved these changes Jul 28, 2025

View reviewed changes

danielhb force-pushed the load_store_cluster_opts branch from 8561859 to 170d0ae Compare August 6, 2025 13:25

danielhb added 4 commits August 6, 2025 08:19

misched-load-clustering.ll: exercise new clustering flags

63e9d80

- add store-clustering=false to the existing LDCLUSTER test; - add a new STCLUSTER test with load-clustering=false; - add a new DEFAULTCLUSTER test with default clustering options.

Remove 'enableXXXClustering' methods from RISCVSubtarget.h

a743f80

We can get the existing enableXXClustering methods to be autogenerated by prefixing the enums with 'Enable', e.g. 'EnableMISchedLoadClustering', and an autogenerated 'enableMISchedLoadClustering' method will be created.

danielhb force-pushed the load_store_cluster_opts branch from 170d0ae to a743f80 Compare August 6, 2025 15:30

mshockwave merged commit 8e57689 into llvm:main Aug 6, 2025
5 checks passed

	"NoMISchedLoadClustering", "true", "Disable load clustering in the machine scheduler">;
	"MISchedLoadClustering", "false", "Disable load clustering in the machine scheduler">;

[RISCV] add load/store misched/PostRA subtarget features #149409

[RISCV] add load/store misched/PostRA subtarget features #149409

Uh oh!

Conversation

danielhb commented Jul 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jul 17, 2025

Uh oh!

llvmbot commented Jul 17, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mgudim left a comment

Choose a reason for hiding this comment

Uh oh!

mshockwave left a comment

Choose a reason for hiding this comment

Uh oh!

danielhb commented Jul 18, 2025

Uh oh!

mshockwave left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wangpc-pp left a comment

Choose a reason for hiding this comment

Uh oh!

mgudim left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wangpc-pp left a comment

Choose a reason for hiding this comment

Uh oh!

topperc left a comment

Choose a reason for hiding this comment

Uh oh!

danielhb commented Aug 6, 2025

Uh oh!

danielhb commented Aug 6, 2025

Uh oh!

topperc commented Aug 6, 2025

Uh oh!

danielhb commented Aug 6, 2025

Uh oh!

Uh oh!

github-actions bot commented Aug 6, 2025

danielhb commented Jul 17, 2025 •

edited

Loading